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We show that Stochastic Anneahng can be successfuUy applied to gain new results on the Prob- 
abilistic Traveling Salesman Problem (PTSP). The probabilistic "traveling salesman" must decide 
on an a priori order in which to visit n cities (randomly distributed over a unit square) before 
learning that some cities can be omitted. We find the optimized average length of the pruned tour 
follows -E'(Lpj.yjjg(j) = y'np(0.872 — 0.105p)/(np) where p is the probability of a city needing to 
be visited, and f(np) ^ 1 as np 00. The average length of the a priori tour (before omitting 
any cities) is found to follow E i^L^^ priori) ~ \f^P^P) where /3(p) — 1 25-0 82 in(p) measured 
for 0.05 < p < 0.6. Scaling arguments and indirect measurements suggest that /3(p) tends towards 
a constant for p < 0.03. Our stochastic annealing algorithm is based on limited sampling of the 
pruned tour lengths, exploiting the sampling error to provide the analogue of thermal fluctuations in 
simulated (thermal) annealing. The method has general application to the optimization of functions 
whose cost to evaluate rises with the precision required. 

PACS numbers: 02.60.Pn, 02.70.Lq 



I. INTRODUCTION 

Many real systems present problems of stochastic opti- 
mization. These include communications networks, pro- 
tein design P| and oil field models 0, in all of which 
uncertainty plays a central role. We will consider the 
case where the outcome g(x,Lo) depends not only on pa- 
rameters X to be chosen, but also on unknowns uj. We 
can only average with respect to these unknowns, aim- 
ing to find the 'solution' x which optimises the average 
outcome. Thus we seek to find x ^ X which minimizes 



g{x,uj)f[uj)duj 



(1) 



where X is the solution space of the problem and f{Lij) 
is the probability distribution of the uncertain variables. 
K> , Stochastic optimization was borne out of an idea by 
}^ • Robbins & Monro Q . They considered solving the prob- 
ed I lem of finding 



G{x) = a 



(2) 
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where G is some monotonic function of x and a is a 
parameter. G is not known directly, but can only be 
estimated. Their technique to solve this problem is called 
stochastic approximation, and a number of variants of 
this scheme have since been developed 0, 0, IE • 

Stochastic optimization is a more general situation 
since the function to be minimized may have many lo- 
cal minima. We may classify the techniques to solve 
stochastic optimization problems into two classes - exact 
methods and heuristics. Heuristics are more appropriate 
to NP-complete problems, and these are the problems on 
which we focus in this paper. A number of heuristics 
already exist to tackle stochastic optimization problems 
11 mis mill- Many of these are developments from 
simulated annealing 0, 0, 0, 0|, which has been 
shown ^3 to solve stochastic optimization problems with 
probability 1, provided g{x) can be estimated with preci- 
sion greater than 0{t~J) for time step t, where 7 > 1. A 
number of authors 0, 0, 0, 0| have used a modified 
simulated annealing algorithm in which the acceptance 
probability is modified to take some account of the pre- 
cision of the estimates of g{x), and in these cases there 
are a number of convergence results [id . 

Stochastic annealing 1] is a modified simulated anneal- 
ing algorithm which differs from the above approaches 
in two key ways. Firstly the noise present in estimates 
is positively exploited as mimicking thermal noise in a 
slow cooling, as opposed to being regarded as something 
whose influence should be minimised from the outset. 
Secondly, stochastic annealing can be modified to give 



2 



exact simulation of a thermal system. Although this is 
not specifically ruled out by the earlier approaches, no 
attempt has been made with them to satisfy this condi- 
tion. 

In stochastic annealing we estimate g{x) by taking 
r repeated, statistically independent, measurements of 
g{x,uj) each of which we call an instance. For the im- 
plementation of stochastic annealing used in this paper 
we accept all moves for which one estimate (based on 
r instances) for a new state is more favourable than an 
equivalent estimate for the old. This simple procedure 
does not exactly simulate a thermal system, where the 
acceptance probabilities should obey 



Pa- 



B-*A 



(3) 



where f3 = -jp-j; and is the exact difference in g{x) 
between states A and B. However if we assume that 
our estimate change is Gaussian distributed around g{x) 
with standard deviation , where r is the number of 
instances used for each estimate, then it follows that the 
acceptance probability is 



pG 



(4) 



The approximation to a thermal acceptance rule is then 
quite good since 



In 



In 



l-orfl 



l+orf| 



4-TT I 



where 



Pg 



8r 



(5) 



(6) 



identifies the equivalent effective temperature. The small 
coefficient (~ 0.02) of the cubic term in eq. [Slmakes this 
a rather good approximation to true thermal selection. 

Increasing sample size r means that we are more strin- 
gent about not accepting moves that are unfavourable, 
equivalent to lowering the temperature, which is quan- 
tified by eq. El for the Gaussian case. As with standard 
simulated annealing T^, "2^, '2l'| , the question of precisely 
what cooling schedule to use remains something of an art. 



II. PROBABILISTIC TRAVELING SALESMAN 
PROBLEM (PTSP) 

We adopt the PTSP as a good test-bed amongst 
stochastic optimization problems, in much the same way 
as the TSP has been considered a standard amongst de- 
terministic optimization problems. The PTSP falls into 
the class of NP-complete problems 22] , and the TSP is 
a subset of the PTSP. 



The original traveling salesman problem (TSP) is to 
find the shortest tour around n cities, in which each city 
is visited once. For small numbers of cities this is an easy 
task, but the problem is NP-complete: it is believed for 
large n that there is no algorithm which can solve the 
problem in a time polynomial in n. Consideration of the 
traveling salesman problem began with Beardwood et al. 
23] . They showed that in the limit of large numbers of 
cities which are randomly distributed on the unit square, 
the optimal tour length (Ltsp) follows 



-E'(-^TSp) = PtspV^ + a 



(7) 



where Ptsp a-nd a are constants. Here and below E{L) 
denotes the quantity L averaged, after optimization, with 
respect to different city positions, randomly placed on 
the unit square. Numerical simulation p5( gives /?tsp = 
0.7211(3) and a = 0.604(5) as estimates when n > 50. 
Significant divergence from this behaviour is found for 
n < 10, but numerical estimates can be found quickly 
(see appendix). 

The probabilistic trave ling salesman problem (PTSP), 
introduced by Jaillet |2^ l27| . is an extension of the trav- 
eling salesman problem to optimization in the face of un- 
known data. Whereas all of the cities in the TSP must 
be visited once, in the PTSP each city only needs to be 
visited with some probability, p. One first decides upon 
the order in which the cities are to be visited, the 'a pri- 
ori' tour. Subsequently, it is revealed which cities need 
to be visited, and those which do not need to be visited 
are skipped to leave a 'pruned tour'. The order in which 
the cities are to be visited is preserved when pruning su- 
perfluous cities. The objective is to chose an a priori tour 
which minimizes the average length of the pruned tour. 
It is clear from figure ^ that near optimal a priori tours 
may appear very different for different values of p. 

In our terminology, the average pruned tour length is 
averaged over all possible instances of which cities require 
to be visited. This was given by Jaillet as [26| 



(9) 



where 



^pruned = ^ P^(l - P^ L 



(8) 



(9) 



is the sum of the distances between each city and its 
{q + 1)*'* following city on the a priori tour, and the fac- 
tors p^(l — p)' in the preceding equation simply give the 
probability that any particular span skipping q cities oc- 
curs in the pruned tour. Jaillet's closed form expression 
for the average pruned tour length renders the PTSP to 
some extent accessible as a standard (but still NP com- 
plete) optimization problem, and provides some check on 
the PTSP results by stochastic optimization methods. 

It has been conjectured that, in the limit of large 
n, the PTSP strategy is as good as constructing a TSP 
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tour on the cities requiring a visit, the re-optimization 
strategy. This would mean that 



hm ( ^ihl^] ^ ^,,p, 



np 



(10) 



where E{Lpruned) is the pruned tour length further av- 
eraged over city positions after optimisation, which we 
will refer to as the expected pruned tour length. Figure 
121 shows the expected pruned tour length divided by the 
expected re-optimized tour length. Since this quantity is 
tending towards a value significantly greater than 1 for 
p < 1 it demonstrates that the PTSP strategy can be 
worse than the r e-op timization strategy. Jaillet |26l | and 
Bertsimas et al. |23| have also shown that there is a limit 
to how much worse it can be, with 




pruned 



ip) 



where 



cd(p) < Min(0.9212,^). 



(11) 



(12) 



One attempt to solve the PTSP using an exact method 
was taken by Laporte et al. 29] who introduced the use 
of integer linear stochastic programming. Although use 
of algorithms which may exactly solve the PTSP are use- 
ful, they are always very limited in the size of problem 
which may be attempted. Furthermore, the stochastic 
programming algorithm even fails to solve the PTSP on 
certain occasions, thus the accuracy of any statistics that 
would be generated using this method is dubious. 

Three studies have used heuristics to solve the PTSP 
[sol Isil l32j . None of these studies used global search 
heuristics, and all were very restricted in the problem 
size attempted due to computational cost. The evalua- 
tion of a move for the PTSP using equation|Slinvolves the 
computation of O(n^) terms compared to 0(1) computa- 
tions to evaluate a move in the TSP. Thus, to solve a 100 
city problem for the PTSP would take O (10, 000) times 
longer than it would to solve a 100 city problem for the 
TSP. It should be noted, however, that it is only possible 
to make this comparison due to the relative simplicity of 
the PTSP. For many more stochastic optimization prob- 
lems, standard optimization techniques are simply not 
applicable. 



III. FORM OF THE OPTIMAL TOUR & 
SCALING ARGUMENTS 

Optimal a priori PTSP tours for small p, as exempli- 
fied in figure n for p = 0.1, resemble an "angular sort" 
- where cities are ordered by their an gle with respect to 
the centre of the square. Bertsimas [23 proposed that 
an angular sort be optimal as p — > 0, but we can show 



this to be false by comparison to a space-filling curve al- 
gorithm which is generally superior as n — > oo. Such an 
algorithm was introduced by Bartholdi et al. using a 
technique based on a Sierpinski curve. 

For the angular sort with np ^ 1, the probability of 
two cities being nearest neighbours on the pruned tour 
will be vanishingly small for cities which are separated 
from each other by a large angle on the a priori tour. 
This means that only cities that are separated by a small 
angle contribute significantly to eq. |S| Thus for an n city 
tour chosen by angular sort, we may approximate 



(13) 



where Lo is some fraction of the side of the unit square, 
since cities which are sorted with respect to angle will be 
unsorted with respect to radial distance. This leads to 



ri-2 



ii;(Zang)=^ioV^(l-p)«- 

q=0 



(14) 



For np ^ 1 and p ^ 1, we then find that the angular 
sort yields 



LoUp. 



By contrast it has been shown[23| that 



E{L^copt) 



C 



(15) 



(16) 



with probability 1, where E{Lr^f) is the expected length 
of a tour generated by a heuristic based on the Sierpinski 
curve and E{Ljicopt) is the expected length for the re- 
optimization strategy. Using previous computational re- 
sults 25, 2^, we estimate C ~ 1.33, which is worse than 
we achieve using stochastic annealing. Hence, E{Lrgf) is 
given by 



which leads to 



EiLrJ = O(V^) 



E{L 



angj 



np 



(17) 



(18) 



So for large enough np, the angular sort is not optimal. 

From inspection of near-optimal PTSP tours such as 
fig- n propose that the tour behaves differently on 
different length scales; the tour being TSP-like at larger 
length scales, but resembling a locally directed sort at 
smaller length scales. We may construct such a tour and 
use scaling arguments to analyse both the pruned and a 
priori lengths of the optimal tour. Consider dividing the 
unit square into a series of 'blobs', each blob containing 
1/p cities so that of order one city requires a visit. The 
number of such blobs is given by 



TV ~ np 



(19) 
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and for these to approximately cover the unit square their 
typical linear dimension ^ must obey 



N^^ - 1. 



(20) 



Since each blob is visited of order once by a pruned tour, 
we can estimate the expected pruned tour length to be 



np 



(21) 



which we will see below is verified numerically. We can 
similarly estimate the a priori tour length to be n times 
the distance between two cities in the same blob. Thus, 
the expected a priori tour length is 

(ia priori) ,P (22) 



which is more difficult to confirm numerically. 



IV. COMPUTATIONAL RESULTS FOR THE 
PTSP 

We have investigated near optimal PTSP tours for a 
range of different numbers of cities, and various values of 
p. We used stochastic annealing with effective temper- 
atures in the range kT = 0.07 — 0.01, corresponding to 
sample sizes in the range r = 2 — 500. Between 10 and 80 
different random city configurations were optimized (80 
configurations of 30 cities, 40 configurations of 60 cities, 
20 configurations of 90 cities and 10 configurations for 
n > 120 cities). 

Figure 13 shows a master curve for the expected pruned 
tour length divided by ^yrlp. The shift factors have a 
linear fit and the data are consistent with 



-^(-^pruncd) 

np(a — bp) 



= .f{np) 



(23) 



for n > 1, where a = 0.872 ± 0.002, h = 0.105 ± 0.005 
and f{np) — > 1 for large np. The shift factors indicate 



that the PTSP strategy can be no more than qjq^ — 1 = 
14 (±1) worse than the re-optimization strategy. 

The master curve for the a priori tour length is shown 
in fig. ^ Our scaling arguments predict that the shift 
factors /3a priori (p) should tend towards a constant for p 
0. However, data are fit very well by the relation 



a prion 



1 



1.25 - 0.82ln{p) 



(24) 



which would tend to zero as p in conflict with our 
scaling arguments. To resolve this dilemma we need to 
probe very small p. 



V. THE LIMITING CASE p 







We are interested in finding whether /3a priori (p) tends 
towards a constant as p — > 0. To do this using the above 



approach is difficult, since we need a large number of 
cities to produce reliable data for this regime. Extraction 
of this behaviour can however be achieved by comparing 
simulations for different values of n, but fixed np. We 
accomplish this by insisting that each instance has 4 cities 
on the pruned tour. 4 city tours are chosen since they are 
the smallest for which it matters in which order the cities 
are visited. This can be viewed as an efficient way to 
simulate (approximately) the PTSP strategy with p = ^. 

Since we are considering the PTSP at fixed np, if 
/3a priori {p) tends towards a (non-zero) constant as p ^ 

then we expect E plfovs^ *° tend towards a con- 
stant as p — > 0. Simulations in this regime were per- 
formed for = 12 - 210, with 100 different ran- 
dom city configurations used for N < 30, 20 configu- 
rations for < 90 and 10 configurations for N > 120. 



Figure shows a linear-log plot of 



against 



ln(n/4) = ln(l/p). For small n these results reasonably 
match the direct measurements of /^a priori (p), shown for 
comparison. However, for surprisingly large n 100 
which is beyond the range of our /3a priori (p) data, our 
earlier proposal of scaling behaviour is vindicated by 

^ (^a priori) /'"' approaching a constant value. In sum- 
mary we have 



-^(-^a priori) — \ /3a priori (p) 

V p 



where 



« ..(„]} 1.25-0.82 ln{p) 
Ha prion VP^ \ q 

— Po 



p > 0.03 
p < 0.03. 



(25) 



(26) 



VI. NOTES ON ALGORITHM 
IMPLEMENTATION 

We applied stochastic annealing to the PTSP using a 
combination of the 2-opt and 1-shift move- sets [3^ estab- 
lished for the TSP. Both move-sets work similarly to that 
which would be expected for the deterministic case. The 
expected pruned tour length change for the move was es- 
timated by averaging the change in the tour length for a 
number of instances. For a given instance it is not nec- 
essary to decide whether every city is present, but only 
the set of cities closest to the move which determine the 
change in the pruned tour length (see figure EJ. For the 
PTSP, the location of the nearest cities on the pruned 
tour to the move is determined from a simple Poisson 
distribution. 

When using stochastic optimization, the only variable 
over which we have control is the sample size (the num- 
ber of instances) r, whereas the effective temperature 
also entails the standard deviation a of the pruned length 
change over instances. As shown in figure {7\ annealing 
by controlling r alone exhibits a relatively sharp tran- 
sition in the expected pruned tour length. The rapid 
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transition appears to 'freeze in' limitations in the tours 
found (analogous to defects in a physical low tempera- 
ture phase). By comparison we obtain a much smoother 
change when A= is controlled. 

The sharpness of the transition under control by r is 
caused by the fact that a may vary from move to move, 
and is on average lower when the expected pruned tour 
length is less. The jump in the pruned tour length is 
accompanied by a jump in cr and hence the temperature. 
We suggest that quite generally controlling gives a 
better cooling schedule than focussing on r alone. 

VII. CONCLUSION 

We have shown that earlier incompatible ideas about 
the form of PTSP tours especially at small p 22, 30, ^ 
are resolved by a new crossover scaling interpretation. 
The crossover scale corresponds to a group of cities such 
that of order one will typically have to be visited; below 
this scale the (optimsed) a priori PTSP tours resemble 
a local sort whereas they are TSP-like on scales larger 
than the crossover. Our computational results for the 
pruned tour length are summarised by eq. 1231 and clearly 
support the crossover scaling. 

Computationally the a priori tour length is more subtle 
than the pruned tour length, although it does ultimately 
conform to expectations from crossover scaling. We in- 



troduced 4-city tours to probe the behaviour of a priori 
tour length down to very small p. As summarised by eq. 
1251 we find a wide pre- asymptotic regime until recover- 
ing the expected crossover scaling only for p < 0.03. Un- 
derstanding these anomalies in the a priori tour length, 
and confirming them analytically, is left as a future chal- 
lenge. 

We have shown stochastic annealing to be a robust and 
effective stochastic optimization technique, taking the 
PTSP as a representative difficult stochastic optimization 
problem. In this case it enabled us to obtain represen- 
tative results out to unprecedented problem sizes, which 
in turn supported a whole new view of how the tours 
behave. Of relevance to wider applications of stochas- 
tic optimmisation, we have seen that smoother anneal- 
ing can be obtained by directly controlling the effective 
temperature-^ U rather than simply the bare depth of 
sampling r alone. 
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THE LENGTH OF A TSP TOUR FOR SMALL 
NUMBERS OF CITIES 

Numerical estimates of the length of a TSP tour for 
n < 10 are given below 
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FIG. 1: Typical near optimal a priori PTSP tours with n = 300 for p = 0.5(left) and p = O.l(right), respectively. 




FIG. 2: The expected pruned tour length divided by the ex- 
pected re-optimized tour length. This indicates the improve- 
ment one would expect from re-optimization. 



Number of cities n 


Number of instances I 


Average tour length 


a/VI-1 


2 


100000 


1.043 


0.002 


3 


100000 


1.564 


0.002 


4 


5000 


1.889 


0.006 


5 


5000 


2.123 


0.006 


6 


5000 


2.311 


0.005 


7 


5000 


2.472 


0.005 


8 


5000 


2.616 


0.005 


9 


5000 


2.740 


0.005 


10 


5000 


2.862 


0.005 



TABLE I: The length of the optimal TSP tour for n cities. 
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FIG. 3: The master curve for the pruned tour length divided 
by /?pruned(P)V^- "^^^ data follows a smooth curve for n > 
30, and the shift factors follow a linear relationship, suggesting 

that ^^ol^2-o.loip) = finp). Three points with n = 30 
can be seen to fit less well (here and also in fig. 4) , showing 
breakdown of the master curve at small n. 




0.95 I ' ' ' ' ' ' ' ' 1 

20 40 60 80 100 120 140 160 180 

np 

FIG. 4: The master curve for the a priori tour length divided 
by priori '^^e shift factors, inset, are expected to 

tend towards a constant for p ^ 0. The slight, but significant, 
deviation from linear suggests that this might not be the case. 



1 I 1 1 1 1 1 1 1 1 1 

0.5 1 1.5 2 2.5 3 3.5 4 4.5 
ln(l/p) 

FIG. 5: Reciprocal shift factors for a-priori tours (dia- 
monds) compared to estimates from 4 city tours (crosses), 
/3 ' — '(p) — — / Aity \ • ^ ci^y ^^^^ data is optimized 

\ a priori/ 

when each of the instances have 4 cities on the pruned tour. 
The direct measurements do not appear to saturate within the 
accessible range of p. The crosses show matching behaviour, 
with saturation at larger n corresponding to inaccessible p, 
suggesting that E{Lg^ priori) = l^0\/j ^^^^^ P- 







An a-priori tour, and proposed move 


X 

We take one paticular instance and 
estimate length change from that 



FIG. 6: When estimating the expected length change due to 
a move, we randomly generate instances. Only the cities that 
are nearest to the move are needed to calculate the change in 
the pruned tour length. 



0.02 



0.04 



0.06 

-1/2 



0.08 



0.1 



FIG. 7: The expected pruned tour length for anneahngs when 
r and l/T = are increased monotonically. The sharp drop 
in the pruned tour length is scon when only r is controlled, 
demonstrating that this "freezes in" imperfections in the tour. 
The system was annealed at each value of the temperature 
and value of r for 50,000 Monte Carlo steps with n = 300 and 
p = 0.1. 



